Overview

Dataset info

Number of variables: 15
Number of observations: 1274900
Missing cells: 0 (0.0%)
Duplicate rows: 154 (< 0.1%)
Total size in memory: 494.5 MiB
Average record size in memory: 406.7 B
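
The size figures above are related by simple arithmetic. The sketch below recomputes the average record size and the duplicate-row share from the reported totals; it assumes that "MiB" means 1024 * 1024 bytes, and the variable names are illustrative only.

```python
MIB = 1024 ** 2  # bytes per mebibyte (assumed meaning of "MiB" in the report)

n_rows = 1_274_900      # "Number of observations"
total_mib = 494.5       # "Total size in memory"
duplicate_rows = 154    # "Duplicate rows"

# Average record size is the total memory divided by the row count.
avg_record_bytes = total_mib * MIB / n_rows
# Share of rows that are duplicates, as a fraction of all rows.
duplicate_share = duplicate_rows / n_rows

print(f"{avg_record_bytes:.1f} B per record")
print(f"{duplicate_share:.4%} of rows are duplicates")
```

Both results agree with the reported values (406.7 B and "< 0.1%").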

Variables types

NUM: 8
CAT: 5
DATE: 1
BOOL: 1

Reproduction info

Date of analysis: 2020-02-18 09:16:11.107771
Version: pandas-profiling v2.4.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Configuration: config.yaml

Warnings

Dataset has 154 (< 0.1%) duplicate rows (Warning)
DstAddr has a high cardinality: 634 distinct values (Warning)
dTos has 879713 (69.0%) zeros (Zeros)
Dur has 19550 (1.5%) zeros (Zeros)
SrcAddr has a high cardinality: 225 distinct values (Warning)
SrcBytes has 17444 (1.4%) zeros (Zeros)
State has a high cardinality: 58 distinct values (Warning)
sTos has 1251404 (98.2%) zeros (Zeros)
TotPkts is highly correlated with TotBytes (High Correlation)
TotBytes is highly correlated with TotPkts (High Correlation)
State is highly correlated with Dir (High Correlation)
Dir is highly correlated with State (High Correlation)
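
The zero percentages in the warnings can be reproduced from the raw counts and the number of observations. This is a minimal sketch of that arithmetic only; it does not reproduce the thresholds pandas-profiling uses to decide when to raise a warning, which the report does not state.

```python
n_rows = 1_274_900  # "Number of observations" from the overview

# Zero counts as listed in the warnings.
zero_counts = {
    "dTos": 879_713,
    "Dur": 19_550,
    "SrcBytes": 17_444,
    "sTos": 1_251_404,
}

# Percentage of zeros per variable, rounded to one decimal as in the report.
zero_share = {
    name: round(100 * count / n_rows, 1) for name, count in zero_counts.items()
}

for name, pct in zero_share.items():
    print(f"{name}: {pct}% zeros")
```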

Variables

Dir
Categorical

HIGH CORRELATION
Distinct count: 5
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 9.7 MiB

Value Count Frequency (%)
-> 1192811 93.6%
 
<?> 58084 4.6%
 
?> 12357 1.0%
 
<-> 11641 0.9%
 
<- 7 < 0.1%
 

Composition

Contains chars: False
Contains digits: False
Contains whitespace: True
Contains non-words: True

Length

Max length: 5
Mean length: 4.999994509
Min length: 4

Dport
Real number (ℝ≥0)

Distinct count: 8
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.2327586477370773
Minimum: 1
Maximum: 8
Zeros: 0
Zeros (%): 0.0%
Memory size: 9.7 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1
Q3: 1
95-th percentile: 2
Maximum: 8
Range: 7
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.063326067
Coefficient of variation (CV): 0.8625581895
Kurtosis: 23.74640325
Mean: 1.232758648
Median Absolute Deviation (MAD): 0.4397318218
Skewness: 4.892853971
Sum: 1571644
Variance: 1.130662325

Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[1. 1.5 2.5 4.5 5.5 7.5 8. ], "bayesian blocks" binning strategy used)

Value Count Frequency (%)
1 1204282 94.5%
 
5 11146 0.9%
 
8 11143 0.9%
 
7 10742 0.8%
 
6 10451 0.8%
 
3 10180 0.8%
 
4 10068 0.8%
 
2 6888 0.5%
 
Minimum 5 values
Value Count Frequency (%)
1 1204282 94.5%
 
2 6888 0.5%
 
3 10180 0.8%
 
4 10068 0.8%
 
5 11146 0.9%
 
Maximum 5 values
Value Count Frequency (%)
8 11143 0.9%
 
7 10742 0.8%
 
6 10451 0.8%
 
5 11146 0.9%
 
4 10068 0.8%
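
The descriptive statistics reported for Dport are linked by standard identities: the mean is the sum divided by the number of observations, the variance is the square of the standard deviation, and the coefficient of variation is the standard deviation divided by the mean. The sketch below checks those identities against the reported numbers; it is a consistency check under those textbook definitions, not a re-run of the profiler.

```python
# Values copied from the Dport section of the report.
n_rows = 1_274_900
total = 1_571_644          # "Sum"
mean = 1.232758648         # "Mean"
std = 1.063326067          # "Standard deviation"
variance = 1.130662325     # "Variance"
cv = 0.8625581895          # "Coefficient of variation (CV)"

# Recompute each derived quantity from the others.
mean_from_sum = total / n_rows
variance_from_std = std ** 2
cv_from_std = std / mean

print(f"mean     reported {mean:.9f}  recomputed {mean_from_sum:.9f}")
print(f"variance reported {variance:.9f}  recomputed {variance_from_std:.9f}")
print(f"CV       reported {cv:.9f}  recomputed {cv_from_std:.9f}")
```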
 

DstAddr
Categorical

HIGH CARDINALITY
Distinct count: 634
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 9.7 MiB

Value Count Frequency (%)
239.255.255.250 56385 4.4%
 
36.66.107.162 28050 2.2%
 
103.4.18.170 20800 1.6%
 
87.106.77.193 19959 1.6%
 
45.79.186.178 19379 1.5%
 
81.88.24.211 19379 1.5%
 
82.165.142.107 19379 1.5%
 
46.163.78.94 19379 1.5%
 
62.75.145.252 19378 1.5%
 
178.79.172.45 19376 1.5%
 
Other values (624) 1033436 81.1%
 

Composition

Contains chars: True
Contains digits: True
Contains whitespace: False
Contains non-words: True

Length

Max length: 35
Mean length: 13.23139305
Min length: 7

dTos
Real number (ℝ)

ZEROS
Distinct count: 9
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.9461345987920622
Minimum: -1.0
Maximum: 164.0
Zeros: 879713
Zeros (%): 69.0%
Memory size: 9.7 MiB

Quantile statistics

Minimum: -1
5-th percentile: -1
Q1: -1
median: 0
Q3: 0
95-th percentile: 0
Maximum: 164
Range: 165
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 18.44552088
Coefficient of variation (CV): 9.478029369
Kurtosis: 70.71414942
Mean: 1.946134599
Median Absolute Deviation (MAD): 4.414838723
Skewness: 8.458025833
Sum: 2481127
Variance: 340.2372407

Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ -1. -0.5 4. 20. 36. 56. 76. 126. 164. ], "bayesian blocks" binning strategy used)

Value Count Frequency (%)
0 879713 69.0%
 
-1 374117 29.3%
 
164 15685 1.2%
 
40 3136 0.2%
 
72 1901 0.1%
 
32 171 < 0.1%
 
88 147 < 0.1%
 
80 27 < 0.1%
 
8 3 < 0.1%
 
Minimum 5 values
Value Count Frequency (%)
-1 374117 29.3%
 
0 879713 69.0%
 
8 3 < 0.1%
 
32 171 < 0.1%
 
40 3136 0.2%
 
Maximum 5 values
Value Count Frequency (%)
164 15685 1.2%
 
88 147 < 0.1%
 
80 27 < 0.1%
 
72 1901 0.1%
 
40 3136 0.2%
 

Dur
Real number (ℝ≥0)

ZEROS
Distinct count: 517021
Unique (%): 40.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 278.54504866455954
Minimum: 0.0
Maximum: 3600.0
Zeros: 19550
Zeros (%): 1.5%
Memory size: 9.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0.000176
Q1: 0.420203
median: 8.998371
Q3: 76.301514
95-th percentile: 3505.510742
Maximum: 3600
Range: 3600
Interquartile range (IQR): 75.881311

Descriptive statistics

Standard deviation: 879.6331257
Coefficient of variation (CV): 3.157956424
Kurtosis: 9.337380911
Mean: 278.5450487
Median Absolute Deviation (MAD): 452.6495024
Skewness: 3.347057964
Sum: 355117082.5
Variance: 773754.4359

Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0.00000000e+00 2.00000000e-06 2.85000000e-05 6.15000000e-05 1.03500000e-04 ... 3.59999524e+03 3.59999670e+03 3.59999915e+03 3.59999988e+03 3.60000000e+03], "bayesian blocks" binning strategy used)

Value Count Frequency (%)
0 19550 1.5%
 
0.000158 1092 0.1%
 
0.000155 1051 0.1%
 
0.000149 1048 0.1%
 
0.000159 1047 0.1%
 
0.00015 1045 0.1%
 
0.000153 1041 0.1%
 
0.000148 1038 0.1%
 
0.000156 1026 0.1%
 
0.000147 1024 0.1%
 
Other values (517011) 1245938 97.7%
 
Minimum 5 values
Value Count Frequency (%)
0 19550 1.5%
 
4e-06 4 < 0.1%
 
5e-06 4 < 0.1%
 
6e-06 3 < 0.1%
 
7e-06 6 < 0.1%
 
Maximum 5 values
Value Count Frequency (%)
3600 49 < 0.1%
 
3599.999756 48 < 0.1%
 
3599.999512 46 < 0.1%
 
3599.999268 23 < 0.1%
 
3599.999023 15 < 0.1%
 

Label
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 9.7 MiB

Value Count Frequency (%)
1 995086 78.1%
 
0 279814 21.9%
 

Proto
Categorical

Distinct count: 11
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 9.7 MiB

Value Count Frequency (%)
tcp 1076641 84.4%
 
ipv6-icmp 118979 9.3%
 
udp 79032 6.2%
 
llc 190 < 0.1%
 
46846 24 < 0.1%
 
rtcp 14 < 0.1%
 
47413 12 < 0.1%
 
30718 2 < 0.1%
 
29599 2 < 0.1%
 
24533 2 < 0.1%
 

Composition

Contains chars: True
Contains digits: True
Contains whitespace: False
Contains non-words: True

Length

Max length: 9
Mean length: 3.560021963
Min length: 3

Sport
Real number (ℝ≥0)

Distinct count: 8
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.499290140403168
Minimum: 1
Maximum: 8
Zeros: 0
Zeros (%): 0.0%
Memory size: 9.7 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 4
Q3: 6
95-th percentile: 8
Maximum: 8
Range: 7
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.292002967
Coefficient of variation (CV): 0.5094143509
Kurtosis: -1.238124605
Mean: 4.49929014
Median Absolute Deviation (MAD): 2.000552965
Skewness: -0.0004479751242
Sum: 5736145
Variance: 5.253277599

Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[1. 1.5 7.5 8. ], "bayesian blocks" binning strategy used)

Value Count Frequency (%)
1 160070 12.6%
 
5 159385 12.5%
 
7 159367 12.5%
 
3 159361 12.5%
 
6 159345 12.5%
 
4 159337 12.5%
 
8 159335 12.5%
 
2 158700 12.4%
 
Minimum 5 values
Value Count Frequency (%)
1 160070 12.6%
 
2 158700 12.4%
 
3 159361 12.5%
 
4 159337 12.5%
 
5 159385 12.5%
 
Maximum 5 values
Value Count Frequency (%)
8 159335 12.5%
 
7 159367 12.5%
 
6 159345 12.5%
 
5 159385 12.5%
 
4 159337 12.5%
 

SrcAddr
Categorical

HIGH CARDINALITY
Distinct count: 225
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 9.7 MiB

Value Count Frequency (%)
10.0.2.102 161551 12.7%
 
192.168.1.115 133465 10.5%
 
192.168.1.122 114129 9.0%
 
192.168.1.124 82916 6.5%
 
192.168.1.121 80099 6.3%
 
192.168.1.127 77092 6.0%
 
192.168.1.125 70581 5.5%
 
192.168.1.2 58068 4.6%
 
192.168.1.118 44508 3.5%
 
192.168.1.113 44244 3.5%
 
Other values (215) 408247 32.0%
 

Composition

Contains chars: True
Contains digits: True
Contains whitespace: False
Contains non-words: True

Length

Max length: 35
Mean length: 13.68688917
Min length: 2

SrcBytes
Real number (ℝ≥0)

ZEROS
Distinct count: 2852
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.211170511354909
Minimum: 0.0
Maximum: 12.277700043602364
Zeros: 17444
Zeros (%): 1.4%
Memory size: 9.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 5.153291594
Q1: 5.272999559
median: 6.762729507
Q3: 6.88550967
95-th percentile: 7.822044008
Maximum: 12.27770004
Range: 12.27770004
Interquartile range (IQR): 1.612510111

Descriptive statistics

Standard deviation: 1.132518694
Coefficient of variation (CV): 0.1823357919
Kurtosis: 10.20576629
Mean: 6.211170511
Median Absolute Deviation (MAD): 0.8665148815
Skewness: -2.13105179
Sum: 7918621.285
Variance: 1.282598591

Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 0. 2.00366659 4.05910352 4.1270043 4.158761 ... 10.2662365 10.61525953 10.80658415 10.92701362 12.27770004], "bayesian blocks" binning strategy used)

Value Count Frequency (%)
5.272999559 225869 17.7%
 
5.765191103 121148 9.5%
 
6.863803391 109284 8.6%
 
5.153291594 68635 5.4%
 
6.762729507 49613 3.9%
 
6.86171134 46063 3.6%
 
6.96979067 39771 3.1%
 
6.923628628 38813 3.0%
 
6.974478911 30834 2.4%
 
5.752572639 30473 2.4%
 
Other values (2842) 514397 40.3%
 
Minimum 5 values
Value Count Frequency (%)
0 17444 1.4%
 
4.007333185 6 < 0.1%
 
4.110873864 160 < 0.1%
 
4.143134726 23 < 0.1%
 
4.17438727 2 < 0.1%
 
Maximum 5 values
Value Count Frequency (%)
12.27770004 2 < 0.1%
 
11.82696949 1 < 0.1%
 
11.81362988 1 < 0.1%
 
11.42294605 1 < 0.1%
 
11.35561647 1 < 0.1%
 

StartTime
Date

Distinct count: 1274020
Unique (%): 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 9.7 MiB
Minimum: 1970-01-01 01:00:00
Maximum: 1970-03-05 01:19:27.214865

State
Categorical

HIGH CARDINALITY
HIGH CORRELATION
Distinct count: 58
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 9.7 MiB

Value Count Frequency (%)
FSPA_FSPA 184710 14.5%
 
S_ 175128 13.7%
 
SRPA_SA 152375 12.0%
 
SRPA_SPA 106300 8.3%
 
FSPA_FSRPA 95665 7.5%
 
MRQ 85351 6.7%
 
INT 70411 5.5%
 
FSRPA_SA 58378 4.6%
 
PA_R 56624 4.4%
 
S_RA 53613 4.2%
 
Other values (48) 236345 18.5%
 

Composition

Contains chars: True
Contains digits: False
Contains whitespace: False
Contains non-words: False

Length

Max length: 11
Mean length: 6.26635893
Min length: 2

sTos
Real number (ℝ)

ZEROS
Distinct count: 150
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.6381465212957879
Minimum: -1.0
Maximum: 255.0
Zeros: 1251404
Zeros (%): 98.2%
Memory size: 9.7 MiB

Quantile statistics

Minimum: -1
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 255
Range: 256
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 10.21568738
Coefficient of variation (CV): 16.00837275
Kurtosis: 265.8055822
Mean: 0.6381465213
Median Absolute Deviation (MAD): 1.298195917
Skewness: 16.20275946
Sum: 813573
Variance: 104.3602687

Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ -1. -0.5 0.5 8.5 14.5 ... 238.5 239.5 247.5 248.5 255. ], "bayesian blocks" binning strategy used)

Value Count Frequency (%)
0 1251404 98.2%
 
-1 17676 1.4%
 
164 4266 0.3%
 
40 717 0.1%
 
16 44 < 0.1%
 
225 36 < 0.1%
 
44 31 < 0.1%
 
23 28 < 0.1%
 
24 26 < 0.1%
 
239 24 < 0.1%
 
Other values (140) 648 0.1%
 
Minimum 5 values
Value Count Frequency (%)
-1 17676 1.4%
 
0 1251404 98.2%
 
1 2 < 0.1%
 
8 2 < 0.1%
 
9 2 < 0.1%
 
Maximum 5 values
Value Count Frequency (%)
255 2 < 0.1%
 
253 2 < 0.1%
 
251 2 < 0.1%
 
249 2 < 0.1%
 
248 19 < 0.1%
 

TotBytes
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 5270
Unique (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.88189896993879
Minimum: 4.007333185232471
Maximum: 15.899465026055656
Zeros: 0
Zeros (%): 0.0%
Memory size: 9.7 MiB

Quantile statistics

Minimum: 4.007333185
5-th percentile: 5.272999559
Q1: 5.926926026
median: 7.029972912
Q3: 8.010027528
95-th percentile: 8.371936179
Maximum: 15.89946503
Range: 11.89213184
Interquartile range (IQR): 2.083101503

Descriptive statistics

Standard deviation: 1.194238167
Coefficient of variation (CV): 0.1735332314
Kurtosis: -0.8339599688
Mean: 6.88189897
Median Absolute Deviation (MAD): 1.041201863
Skewness: -0.05406001906
Sum: 8773732.997
Variance: 1.426204799

Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 4.00733319 4.05910352 4.1270043 4.158761 4.18953994 ... 13.2662611 13.26876929 13.27863093 13.42343599 15.89946503], "bayesian blocks" binning strategy used)

Value Count Frequency (%)
5.272999559 172265 13.5%
 
6.084499413 121144 9.5%
 
5.424950017 56622 4.4%
 
5.926926026 53381 4.2%
 
6.762729507 49530 3.9%
 
7.448916103 49238 3.9%
 
7.030857476 40380 3.2%
 
8.289539485 38557 3.0%
 
6.06610809 31808 2.5%
 
7.450660796 27587 2.2%
 
Other values (5260) 634388 49.8%
 
Minimum 5 values
Value Count Frequency (%)
4.007333185 1 < 0.1%
 
4.110873864 160 < 0.1%
 
4.143134726 23 < 0.1%
 
4.17438727 2 < 0.1%
 
4.204692619 2852 0.2%
 
Maximum 5 values
Value Count Frequency (%)
15.89946503 1 < 0.1%
 
14.76333019 1 < 0.1%
 
13.70251654 1 < 0.1%
 
13.43290223 1 < 0.1%
 
13.41396974 1 < 0.1%
 

TotPkts
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 318
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.211478535890647
Minimum: 0.6931471805599453
Maximum: 8.708804795117285
Zeros: 0
Zeros (%): 0.0%
Memory size: 9.7 MiB

Quantile statistics

Minimum: 0.6931471806
5-th percentile: 1.098612289
Q1: 1.609437912
median: 2.302585093
Q3: 2.890371758
95-th percentile: 3.401197382
Maximum: 8.708804795
Range: 8.015657615
Interquartile range (IQR): 1.280933845

Descriptive statistics

Standard deviation: 0.7070524293
Coefficient of variation (CV): 0.3197193271
Kurtosis: -0.4210600703
Mean: 2.211478536
Median Absolute Deviation (MAD): 0.5848056383
Skewness: 0.1047761614
Sum: 2819413.985
Variance: 0.4999231378

Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0.69314718 0.89587973 1.24245332 1.49786614 1.70059869 ... 6.27381951 6.30718686 6.38350093 6.68986373 8.7088048 ], "bayesian blocks" binning strategy used)

Value Count Frequency (%)
1.945910149 211299 16.6%
 
1.386294361 189402 14.9%
 
2.397895273 169325 13.3%
 
2.944438979 119852 9.4%
 
1.098612289 84014 6.6%
 
2.302585093 72303 5.7%
 
2.995732274 66530 5.2%
 
2.197224577 61206 4.8%
 
2.890371758 59765 4.7%
 
1.609437912 57600 4.5%
 
Other values (308) 183604 14.4%
 
Minimum 5 values
Value Count Frequency (%)
0.6931471806 19597 1.5%
 
1.098612289 84014 6.6%
 
1.386294361 189402 14.9%
 
1.609437912 57600 4.5%
 
1.791759469 3221 0.3%
 
Maximum 5 values
Value Count Frequency (%)
8.708804795 1 < 0.1%
 
7.731930722 1 < 0.1%
 
7.299797367 1 < 0.1%
 
7.281385664 1 < 0.1%
 
6.716594774 1 < 0.1%
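
The report does not say how TotPkts was preprocessed, but its smallest values coincide with natural logarithms of small integers, which suggests the column was log-transformed before profiling. The sketch below checks that observation by exponentiating the five lowest values. Whether the original transform was log(x) or log(1 + x) cannot be determined from the report, so the recovered integers are an assumption about the scale, not confirmed raw packet counts.

```python
import math

# Five smallest TotPkts values as listed in the report.
tot_pkts_lowest = [0.6931471806, 1.098612289, 1.386294361, 1.609437912, 1.791759469]

# Undo a natural-log transform (assumed) and round to the nearest integer.
recovered = [math.exp(v) for v in tot_pkts_lowest]
nearest = [round(x) for x in recovered]

for value, raw, integer in zip(tot_pkts_lowest, recovered, nearest):
    print(f"exp({value}) = {raw:.6f}  (nearest integer {integer})")
```

If the transform was log(1 + x) instead, `math.expm1` would be the matching inverse and each recovered count would be one lower.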
 

Correlations

Missing values

Sample

First rows

   Dir  Dport  DstAddr            dTos  Dur          Label  Proto      Sport  SrcAddr                    SrcBytes  StartTime                   State  sTos  TotBytes  TotPkts
0  ->   1      00:00:00:00:00:00  -1.0  0.000000     0      llc        1      00:00:00:00:00:00          4.110874  1970-01-01 01:00:00.000000  INT    -1.0  4.110874  0.693147
1  ->   1      ff02::1:2          -1.0  3479.992676  0      udp        1      fe80::6dd3:1409:3456:8562  9.126850  1970-01-01 01:00:06.524077  INT    0.0   9.126850  4.158883
2  ->   1      ff02::1:ff56:8562  -1.0  0.000000     0      ipv6-icmp  1      ::                         4.369448  1970-01-01 01:00:06.653993  NNS    0.0   4.369448  0.693147
3  ->   1      ff02::2            -1.0  8.001393     0      ipv6-icmp  1      fe80::6dd3:1409:3456:8562  5.351858  1970-01-01 01:00:06.654046  NRS    0.0   5.351858  1.386294
4  ->   1      ff02::16           -1.0  0.500216     0      ipv6-icmp  1      fe80::6dd3:1409:3456:8562  5.198497  1970-01-01 01:00:06.654282  MHR    0.0   5.198497  1.098612
5  <->  1      8.8.8.8            0.0   4.003234     1      udp        7      10.0.2.102                 5.030438  1970-01-01 01:00:12.595887  CON    0.0   5.501258  1.386294
6  ->   1      4.4.4.4            -1.0  3.004245     1      udp        7      10.0.2.102                 5.433722  1970-01-01 01:00:13.593757  INT    0.0   5.433722  1.386294
7  <->  1      8.8.8.8            0.0   0.001224     1      udp        4      10.0.2.102                 4.343805  1970-01-01 01:00:16.599814  CON    0.0   5.198497  1.098612
8  ->   1      ff02::1:3          -1.0  0.097591     0      udp        5      fe80::6dd3:1409:3456:8562  5.129899  1970-01-01 01:02:32.956442  INT    0.0   5.129899  1.098612
9  ->   1      224.0.0.252        -1.0  0.097688     1      udp        8      10.0.2.102                 4.859812  1970-01-01 01:02:32.956736  INT    0.0   4.859812  1.098612

Last rows

         Dir  Dport  DstAddr      dTos  Dur          Label  Proto      Sport  SrcAddr                    SrcBytes  StartTime                   State  sTos  TotBytes  TotPkts
1274890  ->   1      202.44.54.4  0.0   901.600342   0      tcp        2      10.0.2.102                 6.628041  1970-01-07 10:19:35.747418  RST    0.0   7.206377  2.484907
1274891  ->   1      202.44.54.4  0.0   901.689575   0      tcp        2      10.0.2.102                 6.628041  1970-01-07 10:34:37.348084  RST    0.0   7.206377  2.484907
1274892  ->   1      ff02::1:2    -1.0  3479.460693  0      udp        1      fe80::6dd3:1409:3456:8562  9.126850  1970-01-07 10:43:59.248623  REQ    0.0   9.126850  4.158883
1274893  ->   1      202.44.54.4  0.0   901.442688   0      tcp        2      10.0.2.102                 6.628041  1970-01-07 10:49:39.037949  RST    0.0   7.206377  2.484907
1274894  ->   1      202.44.54.4  0.0   901.459656   0      tcp        2      10.0.2.102                 6.628041  1970-01-07 11:04:40.480971  RST    0.0   7.206377  2.484907
1274895  ->   1      202.44.54.4  0.0   901.006653   0      tcp        2      10.0.2.102                 6.628041  1970-01-07 11:19:41.941002  RST    0.0   7.206377  2.484907
1274896  ->   1      202.44.54.4  0.0   901.578491   0      tcp        2      10.0.2.102                 6.628041  1970-01-07 11:34:42.947968  RST    0.0   7.206377  2.484907
1274897  ->   1      ff02::1:2    -1.0  1350.547119  0      udp        1      fe80::6dd3:1409:3456:8562  8.316056  1970-01-07 11:48:02.717418  REQ    0.0   8.316056  3.367296
1274898  ->   1      202.44.54.4  0.0   905.247253   0      tcp        2      10.0.2.102                 6.628041  1970-01-07 11:49:44.526804  RST    0.0   7.206377  2.484907
1274899  ->   1      202.44.54.4  0.0   17.169039    0      tcp        2      10.0.2.102                 6.553933  1970-01-07 12:04:49.751908  FIN    0.0   7.122867  2.302585